
MDC Repository

Protein Peeling 2: a web server to convert protein structures into series of protein units

Author: de Brevern A.G.
Etchebest C.
Gelly J.-C.
Hazout S.
Publication venue: Oxford University Press
Publication date: 01/07/2006

Protein Peeling 2 (PP2) is a web server for the automatic identification of protein units (PUs) from the 3D coordinates of a protein. PUs are an intermediate level of protein structure description, between protein domains and secondary structures. PP2 is a new tool for better understanding and analyzing the organization of protein structures. It uses only the matrices of protein contact probabilities and cuts the protein structures optimally using the Matthews correlation coefficient. An index assesses the compactness quality of each PU. Results are given both textually and graphically using the Jmol and PyMOL software. The server can be accessed from
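The cutting step described in this abstract can be illustrated with a minimal sketch. The abstract only states that contact probabilities and a Matthews-style correlation are used; the logistic form of the contact probability, the parameters d0 and delta, the exact form of the index, and all function names below are assumptions made for illustration, not the server's documented implementation.

```python
import math

def contact_probabilities(coords, d0=8.0, delta=1.5):
    """Turn C-alpha coordinates into a matrix of contact probabilities.

    Assumption: a logistic function of the inter-residue distance,
    with midpoint d0 and steepness delta (both in angstroms).
    """
    n = len(coords)
    probs = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(coords[i], coords[j])
            probs[i][j] = 1.0 / (1.0 + math.exp((d - d0) / delta))
    return probs

def partition_index(probs, cut):
    """Correlation-like index for splitting residues into [0, cut) and [cut, n).

    a and b sum the contacts inside each segment, c the contacts between them.
    The index approaches 1 when the two segments barely touch each other.
    """
    n = len(probs)
    a = sum(probs[i][j] for i in range(cut) for j in range(cut))
    b = sum(probs[i][j] for i in range(cut, n) for j in range(cut, n))
    c = sum(probs[i][j] for i in range(cut) for j in range(cut, n))
    denom = (a + c) * (b + c)
    return (a * b - c * c) / denom if denom > 0 else 0.0

def best_cut(probs, min_size=2):
    """Scan every admissible cut position and keep the one with the highest index."""
    n = len(probs)
    return max(range(min_size, n - min_size + 1),
               key=lambda cut: partition_index(probs, cut))

# Toy example (hypothetical coordinates): two compact clusters 100 angstroms apart.
cluster = [(0, 0, 0), (3, 0, 0), (0, 3, 0), (3, 3, 0), (1.5, 1.5, 3)]
coords = cluster + [(x + 100, y, z) for x, y, z in cluster]
probs = contact_probabilities(coords)
print(best_cut(probs))
```

On this toy input the scan places the cut between the two clusters. A full peeling procedure would apply such cuts repeatedly and could also consider units made of non-contiguous segments, which this sketch does not attempt.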

Aminopeptidase B, a glucagon-processing enzyme: site directed mutagenesis of the Zn2+-binding motif and molecular modelling

Author: Beinfeld Margery C
Cadel Marie-Sandrine
Etchebest Catherine
Foulon Thierry
Gouzy-Darmon Cécile
Hanquez Chantal
Nicolas Pierre
Pham Viet-Laï
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Aminopeptidase B (Ap-B; EC 3.4.11.6) catalyzes the cleavage of basic residues at the N-terminus of peptides and processes glucagon into miniglucagon. The enzyme exhibits, in vitro, a residual ability to hydrolyze leukotriene A4 into the pro-inflammatory lipid mediator leukotriene B4. The potential bi-functional nature of Ap-B is supported by close structural relationships with LTA4 hydrolase (LTA4H; EC 3.3.2.6). A structure-function analysis is necessary for a detailed understanding of the enzymatic mechanisms of Ap-B and for designing inhibitors, which could be used to determine the complete in vivo functions of the enzyme.
Results: The rat Ap-B cDNA was expressed in E. coli and the purified recombinant enzyme was characterized. Eighteen mutants of the H325EXXHX18E348 Zn2+-binding motif were constructed and expressed. All mutations were found to abolish the aminopeptidase activity. A multiple alignment of 500 sequences of the M1 family of aminopeptidases was performed to identify 3 sub-families of exopeptidases and to build a structural model of Ap-B using the X-ray structure of LTA4H as a template. Although the 3D structures of the two enzymes resemble each other, they differ in certain details. The role that a loop delimiting the active center of Ap-B plays in discriminating basic substrates, as well as the function of consensus motifs such as RNP1 and the Armadillo domain, are discussed. Examination of electrostatic potentials and hydrophobic patches revealed important differences between Ap-B and LTA4H and suggests that Ap-B is involved in protein-protein interactions.
Conclusion: Alignment of the primary structures of the M1 family members clearly demonstrates the existence of different sub-families and highlights residues that are crucial to the enzymatic activity of the whole family. The E. coli recombinant enzyme and the Ap-B structural model constitute powerful tools for investigating the importance and possible roles of these conserved residues in the Ap-B, LTA4H and M1 aminopeptidase catalytic sites and for gaining new insight into their physiological functions. Analysis of the Ap-B structural model indicates that several interactions between Ap-B and other proteins can occur and suggests that endopeptidases might form a complex with Ap-B during hormone processing.
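The Zn2+-binding motif named in this abstract (HEXXH, then 18 residues, then E) lends itself to a simple pattern search. The sketch below is only an illustration: the function name and the toy sequence are hypothetical, and the paper's own analysis relied on a multiple alignment rather than a regular expression.

```python
import re

# HEXXH, then 18 arbitrary residues, then a Glu. The pattern is wrapped in a
# lookahead so that overlapping occurrences are reported as well.
ZN_MOTIF = re.compile(r"(?=(HE..H.{18}E))")

def find_zn_motifs(sequence):
    """Return 1-based (first His, last Glu) positions of every motif occurrence."""
    hits = []
    for match in ZN_MOTIF.finditer(sequence):
        start = match.start() + 1                   # 1-based position of the first His
        end = match.start() + len(match.group(1))   # 1-based position of the final Glu
        hits.append((start, end))
    return hits

# Toy sequence (hypothetical, not the rat Ap-B sequence).
toy = "M" * 10 + "HEAAH" + "G" * 18 + "E" + "K" * 5
print(find_zn_motifs(toy))  # [(11, 34)]
```

In a real protein the reported positions would be compared with the residue numbering used in the abstract, where the first His is residue 325 and the final Glu is residue 348, a span of 24 residues as in the toy hit.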

Springer - Publisher Connector

Quality measures for protein alignment benchmarks

Author: Altschul
Armougom
Babon
Bahr
Barford
Blackshields
Boutonnet
Bradley
Brenner
Bullock
Colloc'h
Do
Edgar
Etchebest
Godzik
Gough
Hasegawa
Holm
Jones
Kabsch
McClure
Mizuguchi
Murzin
Needleman
O'S
Orengo
Raghava
Robert C. Edgar
Roshan
Rost
Russell
Sauder
Schwartz
Shindyalov
Siddiqui
Subramanian
Taylor
Thompson
Van Walle
Yu
Publication venue: Oxford University Press
Publication date: 01/01/2010

Multiple protein sequence alignment methods are central to many applications in molecular biology. These methods are typically assessed on benchmark datasets including BALIBASE, OXBENCH, PREFAB and SABMARK, which are important to biologists in making informed choices between programs. In this article, annotations of domain homology and secondary structure are used to define new measures of alignment quality and to make the first systematic, independent evaluation of these benchmarks. These measures indicate sensitivity and specificity while avoiding the ambiguous residue correspondences and arbitrary distance cutoffs inherent to structural superpositions. Alignments by selected methods that indicate high-confidence columns (ALIGN-M, DIALIGN-T, FSA and MUSCLE) are also assessed. Fold space coverage and effective benchmark database sizes are estimated by reference to domain annotations, and significant redundancy is found in all benchmarks except SABMARK. Questionable alignments are found in all benchmarks, especially in BALIBASE, where 87% of sequences have unknown structure, 20% of columns contain different folds according to SUPERFAMILY and 30% of ‘core block’ columns have conflicting secondary structure according to DSSP. A careful analysis of current protein multiple alignment benchmarks calls into question their ability to determine reliable algorithm rankings.

CiteSeerX

svmPRAT: SVM-based Protein Residue Annotation Toolkit

Author: A Kernytsky
AG de Brevern
AG Murzin
AK Dunker
AR Kinjo
B Rost
C Etchebest
C Kauffman
Christopher Kauffman
DT Jones
G Karypis
G Pollastri
GE Crooks
George Karypis
H Rangwala
Huzefa Rangwala
J Cheng
M Gribskov
O Noivirt-Brik
R Ahmed
R Karchin
R Sanchez
RC Whaley
S Ahmad
S Hirose
SF Altschul
T Joachims
T Schwede
V Vapnik
VN Vapnik
W Kabsch
Y Ofran
Z Dosztányi
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Over the last decade, several prediction methods have been developed for determining the structural and functional properties of individual protein residues using sequence and sequence-derived information. Most of these methods are based on support vector machines, as they provide accurate and generalizable prediction models.
Results: We present a general-purpose protein residue annotation toolkit (svmPRAT) that allows biologists to formulate residue-wise prediction problems. svmPRAT formulates the annotation problem as a classification or regression problem using support vector machines. One of its key features is the ease with which any user-provided information can be incorporated in the form of feature matrices. For every residue, svmPRAT captures local information around the residue to create fixed-length feature vectors. svmPRAT implements accurate and fast kernel functions, and also introduces a flexible window-based encoding scheme that accurately captures signals and patterns for training effective predictive models.
Conclusions: In this work we evaluate svmPRAT on several classification and regression problems including disorder prediction, residue-wise contact order estimation, DNA-binding site prediction, and local structure alphabet prediction. svmPRAT has also been used for the development of a state-of-the-art transmembrane helix prediction method called TOPTMH and a secondary structure prediction method called YASSPP. The toolkit provides practitioners with an efficient and easy-to-use tool for a wide variety of annotation problems.
Availability: http://www.cs.gmu.edu/~mlbio/svmprat
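The idea of building fixed-length feature vectors from a local window can be sketched as follows. This is a plain symmetric window with zero padding, written only to illustrate the principle; the function name is hypothetical, and the toolkit's own flexible encoding scheme and its command-line interface are not reproduced here.

```python
def window_encode(features, w):
    """Build one fixed-length vector per residue from a per-residue feature matrix.

    features: list of rows, one row of numeric features per residue.
    w: half-width of the window; each vector covers positions i-w .. i+w.
    Positions that fall outside the sequence are padded with zeros
    (an assumption of this sketch).
    """
    n = len(features)
    dim = len(features[0]) if n else 0
    pad = [0.0] * dim
    vectors = []
    for i in range(n):
        vec = []
        for k in range(i - w, i + w + 1):
            vec.extend(features[k] if 0 <= k < n else pad)
        vectors.append(vec)
    return vectors

# Three residues with two features each (hypothetical values), window half-width 1.
toy_features = [[1, 2], [3, 4], [5, 6]]
for vec in window_encode(toy_features, 1):
    print(vec)
```

Each output vector has (2w + 1) times as many entries as a single feature row, so every residue, including those at the termini, yields an input of the same length for the classifier or regressor.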

Springer - Publisher Connector

Automated Alphabet Reduction for Protein Datasets

Author: AD Solis
Alfonso Valencia
AR Kinjo
B Rost
C Etchebest
C Sander
CD Livingstone
F Melo
G Harik
G Pollastri
G Venturini
J Bacardit
J Meiler
J Mintseris
J Wang
Jaume Bacardit
JO Wrabl
Jonathan D Hirst
JY Wang
K Yue
KA Dill
KM Misura
LR Murphy
M Cieplak
M Gribskov
M Stout
Michael Stout
MJ Wood
MS Cline
N Krasnogor
Natalio Krasnogor
O Dor
Robert E Smith
S Akanuma
S Henikoff
S Kamtekar
S Kullback
S Miyazawa
S Qin
SF Altschul
T Li
T Noguchi
TM Cover
W Kabsch
X Liu
Y Ikenaka
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: We investigate automated and generic alphabet reduction techniques for protein structure prediction datasets. Reducing alphabet cardinality without losing key biochemical information opens the door to potentially faster machine learning, data mining and optimization applications in structural bioinformatics. Furthermore, reduced but informative alphabets often result in, e.g., more compact and human-friendly classification/clustering rules. In this paper we propose a robust and sophisticated alphabet reduction protocol based on mutual information and state-of-the-art optimization techniques.
Results: We applied this protocol to the prediction of two protein structural features: contact number and relative solvent accessibility. For both features we generated alphabets of two, three, four and five letters. The five-letter alphabets gave prediction accuracies statistically similar to those obtained using the full amino acid alphabet. Moreover, the automatically designed alphabets outperformed other reduced alphabets, whether taken from the literature or human-designed. The differences between our alphabets and the alphabets taken from the literature were quantitatively analyzed. All of the above was performed using a primary sequence representation of proteins. As a final experiment, we extrapolated the obtained five-letter alphabet to reduce a much richer protein representation based on evolutionary information for the prediction of the same two features. Again, the performance gap between the full representation and the reduced representation was small, showing that the results of our automated alphabet reduction protocol, even though they were obtained using a simple representation, also capture the crucial information needed for state-of-the-art protein representations.
Conclusion: Our automated alphabet reduction protocol generates competent reduced alphabets tailored specifically for a variety of protein datasets. This process is done without any domain knowledge, using information theory metrics instead. The reduced alphabets contain some unexpected (but sound) groups of amino acids, thus suggesting new ways of interpreting the data.
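To make the role of mutual information concrete, the sketch below scores a candidate grouping of amino acids by the mutual information between the reduced letters and a class label. It is a simplified illustration under stated assumptions: the function names, the toy residues and the binary labels are hypothetical, and the paper's actual objective and optimization procedure are more elaborate than this single-residue score.

```python
import math
from collections import Counter

def mutual_information(symbols, labels):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(symbols)
    joint = Counter(zip(symbols, labels))
    p_sym = Counter(symbols)
    p_lab = Counter(labels)
    mi = 0.0
    for (s, l), count in joint.items():
        p = count / n
        mi += p * math.log2(p / ((p_sym[s] / n) * (p_lab[l] / n)))
    return mi

def reduce_alphabet(residues, grouping):
    """Replace each residue by the letter of the group it belongs to."""
    return [grouping[r] for r in residues]

# Hypothetical toy data: residues with a binary structural label (1 = buried).
residues = list("AVKDAVKD")
labels = [1, 1, 0, 0, 1, 1, 0, 0]

informative = {"A": "h", "V": "h", "K": "p", "D": "p"}    # two-letter alphabet
uninformative = {"A": "x", "V": "x", "K": "x", "D": "x"}  # everything merged

print(mutual_information(reduce_alphabet(residues, informative), labels))
print(mutual_information(reduce_alphabet(residues, uninformative), labels))
```

A search procedure would explore many such groupings and keep those that retain the most information about the target feature for a given number of letters.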

Springer - Publisher Connector

Public Library of Science (PLOS)

UCL Discovery

Assignment of PolyProline II Conformation and Analysis of Sequence – Structure Relationship

Author: A Bornot
A Kentsis
A Rath
AA Adzhubei
AG de Brevern
Agnel Praveen Joseph
AK Jha
Alexandre G. de Brevern
AP Joseph
AW Chan
B Hess
B Offmann
B Zagrovic
BJ Stapley
BK Kay
BW Chellgren
C Etchebest
CM Venkatachalam
CY Wu
D Eisenberg
D Frishman
D van der Spoel
DA Beck
E Lindahl
E Polverini
EJ Thompson
EW Blanch
F Avbelj
F Eker
FC Bernstein
FC Peterson
FM Richards
G Darnell
G Faure
G Labesse
G Wang
GB Banks
GD Rose
HJC Berendsen
HM Berman
J Esque
J Makowska
J Martin
JC Horng
JC Kendrew
Jean-Christophe Gelly
JM Hicks
JS Richardson
K Chen
L Fourrier
L Pauling
LL Perskie
LL Porter
LR Rabiner
M Bansal
M Dudev
M Kuemin
M Mezei
M Tyagi
MA Kelly
Markus Buehler
MB Swindells
ML Tiffany
MV Cubellis
N Colloc'h
N Sreerama
NC Fitzkee
PK Vlasov
PL Obuchowski
PM Cowan
R Berisio
R Srinivasan
RV Pappu
S Arnott
S Jun
S Kutter
SA Hollingsworth
SJ Whittington
SM King
T Kameda
T Kohonen
TP Creamer
V Sasisekharan
W Kabsch
WL Jorgensen
Y Watanabe
Yohann Mansiaux
Z Liu
Z Shi
Publication venue: Public Library of Science
Publication date: 01/01/2011

BACKGROUND: Secondary structures are elements of great importance in structural biology, biochemistry and bioinformatics. They are broadly composed of two repetitive structures, namely α-helices and β-sheets; apart from turns, the rest is assigned to coil. These repetitive secondary structures have specific and conserved biophysical and geometric properties. The PolyProline II (PPII) helix is yet another interesting repetitive structure, which is less frequent and not usually associated with stabilizing interactions. Recent studies have shown that PPII frequency is higher than expected and that PPII helices could have an important role in protein-protein interactions.
METHODOLOGY/PRINCIPAL FINDINGS: A major factor that limits the study of PPII is that its assignment cannot be carried out with the most commonly used secondary structure assignment methods (SSAMs). The purpose of this work is to propose a PPII assignment methodology that can be defined in the frame of DSSP secondary structure assignment. Considering the ambiguity in PPII assignments by different methods, a consensus assignment strategy was utilized. To define the most consensual rule of PPII assignment, three SSAMs that can assign PPII were compared and analyzed. The assignment rule was defined to have a maximum coverage of all assignments made by these SSAMs. Few constraints were added to the assignment, and only PPII helices at least 2 residues long are defined.
CONCLUSIONS/SIGNIFICANCE: The simple rules designed in this study for characterizing PPII conformation lead to the assignment of 5% of all amino acids as PPII. Sequence-structure relationships associated with PPII, as defined by the different SSAMs, highlight a few striking differences. A specific study of amino acid preferences in the N- and C-cap regions was carried out, as well as of their solvent accessibility and contact patterns. The assignment of PPII can thus be coupled with DSSP, which opens a simple way for further analysis in this field.
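A rule of this kind can be sketched as a post-processing step on a DSSP string. The abstract does not state the angular criteria, so the values below are assumptions: the canonical PPII backbone angles (phi near -75°, psi near +145°), an illustrative tolerance of 29°, the restriction to residues DSSP leaves as coil, and the output letter "P" are all choices made for this example. Only the minimum length of 2 residues comes from the text.

```python
def assign_ppii(dssp, phi, psi, phi0=-75.0, psi0=145.0, tol=29.0, min_len=2):
    """Relabel coil residues as PPII ('P') when their backbone angles fit.

    dssp: string of per-residue DSSP states; 'C', ' ' and '-' are treated as coil.
    phi, psi: per-residue dihedral angles in degrees.
    phi0, psi0, tol: assumed PPII centre and tolerance (no angle wrap-around
    is needed because psi0 + tol stays below 180 degrees).
    min_len: shortest run of consecutive candidates that is kept as PPII.
    """
    coil = {"C", " ", "-"}
    candidate = [
        state in coil and abs(p - phi0) <= tol and abs(q - psi0) <= tol
        for state, p, q in zip(dssp, phi, psi)
    ]
    out = list(dssp)
    i, n = 0, len(candidate)
    while i < n:
        if not candidate[i]:
            i += 1
            continue
        j = i
        while j < n and candidate[j]:
            j += 1
        if j - i >= min_len:          # discard isolated single-residue hits
            out[i:j] = "P" * (j - i)
        i = j
    return "".join(out)

# Hypothetical eight-residue example.
dssp = "HHCCCCEE"
phi = [-60, -60, -75, -70, -120, -78, -120, -120]
psi = [-45, -45, 145, 150, 130, 140, 130, 130]
print(assign_ppii(dssp, phi, psi))  # HHPPCCEE
```

In the example, residues 3 and 4 form a run of two and are relabeled, while residue 6 fits the angles but stands alone and therefore stays coil.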

HAL-Inserm

HAL Descartes

Hal-Diderot

Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks

Author: A.G. de Brevern
C. Etchebest
S. Hazout
Publication venue: Wiley
Publication date: 01/01/2002

Repository of Enriched Structures of Proteins Involved in the Red Blood Cell Environment (RESPIRE).

Author: C Etchebest
H Santuz
S Léonard
S Téletchéa
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2019

The Red Blood Cell (RBC) is a metabolically-driven cell vital for processes such as gas transport and homeostasis. The RBC exposes antigenic proteins at its surface that are critical in blood transfusion. Due to their importance, numerous studies address the cell function as a whole, but more and more details of RBC structure and protein content are now studied using large-scale, state-of-the-art characterisation techniques. Yet the resulting information is frequently scattered across many scientific articles, databases and specialized web servers. To provide a more compendious view of erythrocytes and of their protein content, we developed a dedicated database called RESPIRE that aims at gathering a comprehensive and coherent ensemble of information and data about proteins in the RBC. This cell-driven database lists proteins found in erythrocytes. For a given protein entry, initial data are processed from external portals and enriched by using state-of-the-art bioinformatics methods. As structural information is extremely useful for understanding protein function and predicting the impact of mutations, considerable effort has been devoted to the prediction of protein structures, with a special treatment for membrane proteins. The database can be browsed through text search for reference gene names or protein identifiers, through pre-defined queries or via hyperlinks. The RESPIRE database provides valuable information and unique annotations that should be useful to a wide audience of biologists, clinicians and structural biologists. Database URL: http://www.dsimb.inserm.fr/respire